Expression, characterization, and activity optimization of a novel cellulase from the thermophilic bacterium Cohnella sp. A01

Cellulases are hydrolytic enzymes with wide scientific and industrial applications. We describe a novel cellulase, CelC307, from the thermophilic indigenous Cohnella sp. A01. The 3-D structure of CelC307 was predicted by comparative modeling. Docking of CelC307 with specific inhibitors and molecular dynamics (MD) simulation indicated that these ligands bind in a non-competitive manner. The CelC307 protein was purified and characterized after recombinant expression in Escherichia coli (E. coli) BL21. Using 1% CMC as the substrate, the kinetic values were determined as Km 0.46 mM, kcat 104.30 × 10−3 s−1, and kcat/Km 226.73 M−1 s−1. CelC307 was optimally active at 40 °C and pH 7.0. The culture conditions were optimized for improved CelC307 expression using Plackett–Burman and Box–Behnken designs as follows: temperature 20 °C, pH 7.5, and an inoculum concentration at OD600 = 1. The endoglucanase activity was positively modulated in the presence of Na+, Li+, Ca2+, 2-mercaptoethanol (2-ME), and glycerol. The thermodynamic parameters calculated for CelC307 confirmed its inherent thermostability. The characterized CelC307 may be a suitable candidate for various biotechnological applications.

The increasing global energy demand, coupled with the depletion of fossil reserves and alarming environmental effects in the context of climate change, has directed scientific interest toward greener alternative energy resources 1. Photosynthesis produces about 150-170 × 10^9 tons of lignocellulosic biomass, the most abundant renewable biomaterial, annually. Bioconversion of this biomass could help address the global energy crisis in an eco-friendly way. Lignocellulosic wastes are mainly composed of 30-50% cellulose, 15-35% hemicellulose, and 10-20% lignin 2. Cellulose is an organic polymer made from simple monosaccharides; it is degraded into short polysaccharides and simple sugars by a set of enzymes called cellulases, and these sugars can then be converted into biofuels. Industrial production of these enzymes has found a broad spectrum of applications in many fields. Moreover, the increasing interest in converting lignocellulosic biomass to fermentable sugars for the production of bioethanol, a promising alternative to petrol, has generated a growing demand for cellulases and their related enzymes 3,4. Depending on the mode of action and the substrate specificity, cellulose-hydrolyzing enzymes are grouped into three main classes: [I] exoglucanases or cellobiohydrolases (CBH) (EC 3.2.1.91), which act on the reducing or non-reducing ends of cellulose and cleave cellobiose from glucan chains; [II] endoglucanases (EG) (EC 3.2.1.4), which randomly cut β-1,4-glycosidic bonds of the cellulose chain, providing more 'ends' for exoglucanases; and [III] β-glucosidases (EC 3.2.1.21), which hydrolyze cellobiose and short-chain oligosaccharides to liberate glucose 5. Despite being found in different members of the animal and plant kingdoms, cellulases are produced chiefly by fungi, bacteria, and protozoans that decompose cellulosic materials 6. The thermal stability of an enzyme is a crucial characteristic that is desired in many industrial processes.
Thermostable enzymes can increase the reaction rate under thermal conditions, where higher temperatures reduce substrate viscosity and increase solubility 7. Production of industrial enzymes in their native hosts has its own limitations and difficulties. Recombinant DNA technology has been employed to overcome these drawbacks and to allow the economical, large-scale, and regulated production and characterization of various enzymes 7. This paper describes the expression
Figure 2. Phylogenetic analysis of CelC307. Unrooted circular phylogenetic tree of CelC307 from Cohnella sp. A01. The tree was constructed using MEGA 8.0 with the neighbor-joining method, and it was modified and colored in Adobe Illustrator CS6. CelC307 is marked with a green rectangle. Human endoglucanase (UniProt accession No. Q8N909) was used as the outgroup and is marked with a red rectangle. The color-coded branches correspond to the significant clusters.
Docking results. We observed an inhibitory effect of low concentrations of sugars on CelC307, and three of them (maltose, fructose, and lactose) were used for the docking study. The molecular docking analysis predicted the amino acids involved in CelC307-ligand interactions (Fig. 4a,b,d,e,g,h). Analysis of the binding energy (after 50 ns of MD simulation) showed that the largest negative contributions came from amino acids that participate in hydrogen bonding with the ligands. In contrast, the contribution of active site amino acids was very low. The results indicated that all three ligands (maltose, fructose, and lactose) contact at least one Asp residue of CelC307 (Asp463 in the CelC307-maltose complex, Asp3 in the CelC307-fructose complex, and Asp412 in the CelC307-lactose complex) through a hydrogen bond in aqueous solution (Fig. 4c,f,i).
This may indicate the importance of Asp463, Asp3 and Asp412 in CelC307-sugars interactions.
Molecular dynamics and MM/PBSA analysis. The docked complexes were simulated by MD to check the stability of the enzyme-inhibitor complexes and to obtain atomically detailed structures. The low computed root mean square deviation (RMSD) values of docked CelC307-maltose, CelC307-lactose, and CelC307-fructose (0.42, 0.51, and 0.35 nm, respectively) (Fig. 5a) imply that the constructed models are robust and reliable for further analysis. To assess the residual and side-chain flexibility, the root mean square fluctuation (RMSF) and radius of gyration (Rg) were calculated over 50 ns. The results demonstrated that all three sugars could destabilize the CelC307 structure; however, less structural fluctuation was observed for CelC307-fructose than for the two other complexes (Fig. 5b). The Rg was calculated at 2.25 nm for CelC307 in complex with maltose and lactose. In comparison, the value was lower in the CelC307-fructose complex (2.2 nm), which points to the lower potential of fructose to destabilize CelC307 (Fig. 5c). However, the difference in Rg between the complexes is assumed not to be significant.
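As an illustration of the two structural indices used above, the following minimal sketch computes an RMSD between two coordinate sets and a mass-weighted radius of gyration. It assumes the structures are already superimposed and uses toy coordinates in nm; the function names and data are hypothetical and are not taken from the CelC307 trajectories or from the MD software used in this study.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) points.

    Assumes the two structures have already been superimposed.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same length")
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration around the center of mass."""
    total_mass = sum(masses)
    center = [
        sum(m * c[i] for m, c in zip(masses, coords)) / total_mass
        for i in range(3)
    ]
    weighted = sum(
        m * sum((c[i] - center[i]) ** 2 for i in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted / total_mass)


# Toy two-atom "structures" (nm); purely illustrative values.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.3, 0.0), (1.0, 0.3, 0.0)]

print(rmsd(reference, shifted))
print(radius_of_gyration(reference, [1.0, 1.0]))
```

In a real analysis these quantities would be evaluated frame by frame along the trajectory after fitting each frame to the reference structure.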
The binding free energy (kJ/mol) and its corresponding components were computed from the MM/PBSA analysis of the CelC307-sugar complexes (Table 1). The lower ΔG of binding obtained for the CelC307-fructose complex supported the MD results, indicating that binding between fructose and CelC307 is energetically more favorable. The MM/PBSA analysis indicated that electrostatic interactions, followed by van der Waals interactions, are the main contributors to binding in all three complexes. These calculated binding free energies support the corresponding results of the MD simulation.
Each amino acid's contribution to the binding energy is shown in Fig. 5d,e,f. The highest negative energy belonged to the amino acids involved in complex interactions, whereas the lowest negative energy belonged to the CelC307 active site amino acids. These results support the idea of non-competitive inhibition by the selected sugars.
Heterologous expression, purification, and protease resistance of CelC307. The amplified gene encoding CelC307 was cloned in the pET26b(+) vector under the control of the T7 promoter and with a His-tag. The cloned gene was confirmed by restriction enzyme digestion, colony PCR, and sequencing with gene-specific primers. The results of PCR and double digestion are provided in Supplementary Fig. S2. The SDS-PAGE analysis of the E. coli intracellular crude extract revealed the accumulation of recombinant CelC307 in the soluble form and with the approximate molecular weight of 56 kDa. The CelC307 enzyme purification was conducted using chromatographic and non-chromatographic methods; His-tag affinity to Ni-NTA and the single-step thermal shock. The homogenous purified enzyme, seen as a single protein band on SDS-PAGE, was obtained using both methods ( Fig. 6a and Supplementary Fig. S3). Data on protein purification, including specific activity and purification fold, are summarized in Table 2.
The purified CelC307 enzyme was mixed with proteinase K or trypsin and analyzed by SDS-PAGE. BSA (0.2%) was used as a control. While trypsin had a minimal effect on CelC307, the proteinase K digestion product appeared as a smear on the SDS-PAGE gel. Both proteinase K and trypsin completely digested albumin (Fig. 6b and Supplementary Fig. S4).
Substrate specificity and kinetic analysis. Cellulase activity was assessed both quantitatively and qualitatively. The zymography test confirmed cellulase activity in native-PAGE through the formation of an orange halo in the presence of 1% CMC substrate and 1 M Na+ (Fig. 6c and Supplementary Fig. S5). Cellulase activity was quantified using 1% CMC as the substrate and measuring the DNS color intensity (Fig. 6d and Supplementary Fig. S6). The highest level of activity was observed for the CMC substrate (β-1 → 4 linkage) (100 ± 3.3%), followed by laminarin (21.6 ± 2.4%), chitin (8.4 ± 1.3%), pectic acid (1.9 ± 0.6%), and pustulan (NA). The kinetic parameters of CelC307 were computed from the Michaelis-Menten equation using CMC as the substrate. The values of Km, kcat, and kcat/Km were determined to be 0.46 mM, 104.30 × 10−3 s−1, and 226.73 M−1 s−1, respectively. The kinetic parameters of CelC307 are compared with those of some other studied members of the glycoside hydrolase family in Table 3.
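The relation between the reported kinetic constants can be checked with a short calculation. The sketch below is illustrative only: it takes the Km and kcat values reported above, converts Km to molar units, and evaluates kcat/Km together with the standard Michaelis-Menten rate law. The function names are hypothetical.

```python
def michaelis_menten_rate(substrate, vmax, km):
    """Initial rate v = Vmax * [S] / (Km + [S]); [S] and Km share units."""
    return vmax * substrate / (km + substrate)


def catalytic_efficiency(kcat_per_s, km_molar):
    """Catalytic efficiency kcat/Km in M^-1 s^-1."""
    return kcat_per_s / km_molar


KM_MILLIMOLAR = 0.46   # Km reported in the text (mM)
KCAT_PER_S = 104.30e-3  # kcat reported in the text (s^-1)

efficiency = catalytic_efficiency(KCAT_PER_S, KM_MILLIMOLAR * 1e-3)
print(f"kcat/Km = {efficiency:.1f} M^-1 s^-1")

# At [S] = Km the rate is half of Vmax by definition (Vmax set to 1 here).
print(michaelis_menten_rate(KM_MILLIMOLAR, 1.0, KM_MILLIMOLAR))
```

The computed efficiency agrees with the reported 226.73 M−1 s−1 to within rounding of the last digit.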
Optimization of culture condition.
In the first step of optimization, the Plackett-Burman statistical method was used to find the most significant variables for obtaining high cellulase activity (Supplementary Table S1). The data showed that pH, inoculum concentration, and temperature positively affected the enzyme's cellulolytic activity. The Pareto chart representing the main effects of the variables is shown in Fig. 6e. These three variables were further subjected to the Box-Behnken design to optimize their levels, and the other variables were excluded from the optimization. Using the Box-Behnken method, 20 sets of experiments with appropriate combinations of pH, temperature, and inoculum cell mass were conducted (Table 4). The cellulolytic enzyme activity (U/ml) was the dependent response variable. The highest cellulase activity in the Box-Behnken design was found to be 56 [...] and pH 7.5, with an inoculum at OD600 = 1. The mean observed and predicted responses for cellulase activity were in reasonable agreement and did not differ significantly. A second-order polynomial equation was fitted to the response data obtained from the design, which resulted in the regression equation referred to as Eq. (14).
The statistical significance of Eq. (14) was checked by the F-test and the analysis of variance (ANOVA) for the fitted model (Table 5). The F-value of the model (15.34) and the associated p-value (p < 0.001) meant that the regression model was significant. The F-value (0.6572) for the lack of fit was not significant (p = 0.432), which confirmed the validity of the model. According to the p-values, the linear coefficients (B and C), two of the quadratic coefficients (A², C²), and one of the interaction coefficients (BC) were significant (p < 0.05), while the other coefficients were not significant.
Figure 6f shows the interaction effects of each pair of variables on the measured cellulase activity. To verify the predicted model, a culture medium was prepared with the independent variables set to their predicted optimal levels. The maximum measured cellulase activity was 62.58 U/ml, which was consistent with the activity of 58.4 U/ml predicted by the RSM and was ~3.1-fold higher than the cellulolytic activity obtained under the basal condition. The 'Prob > F' value obtained for the model was < 0.0001, showing that the model was statistically significant at a confidence level of 99.99%. The results of a validation experiment under the optimum conditions verified the model predictions.
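To make the structure of this design concrete, the sketch below generates a coded three-factor Box-Behnken design and evaluates a generic second-order polynomial on it. This is a hedged illustration: the number of center points (three, giving 15 runs rather than the 20 runs reported above), the factor order (A, B, C), and all coefficients are assumptions chosen for demonstration, and they are not the coefficients of Eq. (14).

```python
import itertools


def box_behnken_3(center_points=3):
    """Coded (-1, 0, +1) Box-Behnken design for three factors.

    Each pair of factors takes all four (+/-1, +/-1) combinations while the
    third factor is held at its center level; center runs are appended.
    """
    runs = []
    for i, j in itertools.combinations(range(3), 2):
        for a, b in itertools.product((-1, 1), repeat=2):
            row = [0, 0, 0]
            row[i], row[j] = a, b
            runs.append(tuple(row))
    runs.extend([(0, 0, 0)] * center_points)
    return runs


def quadratic_response(x, coef):
    """Second-order model with terms 1, A, B, C, AB, AC, BC, A^2, B^2, C^2."""
    a, b, c = x
    terms = [1, a, b, c, a * b, a * c, b * c, a * a, b * b, c * c]
    return sum(k * t for k, t in zip(coef, terms))


# Hypothetical coefficients, for illustration only (not from Eq. (14)).
HYPOTHETICAL_COEF = [58.0, 1.0, 3.0, -2.0, 0.5, 0.0, 1.5, -4.0, -1.0, -3.0]

design = box_behnken_3()
responses = [quadratic_response(run, HYPOTHETICAL_COEF) for run in design]

for run, y in zip(design, responses):
    print(run, y)
```

In practice the coefficients are estimated by least-squares regression of the measured activities on these ten model terms, and their significance is then judged by ANOVA as described above.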
Temperature and pH profile of activity and stability of CelC307. The maximum enzyme activity was detected at 40 °C, with a specific activity of 13.6 U/mg. The assay of the survival ability of CelC307 indicated that pre-incubation of the enzyme at temperatures higher than 50 °C, prior to its examination, led to a considerable decrease in its activity (Fig. 7a). Figure 7b shows the thermostability of CelC307. The enzyme preserved more than 95% and 75% of its activity after thermal treatment at 60 and 70 °C, respectively. However, sharp declines were observed thereafter.
The activity and stability-pH profiles of CelC307 are shown in Fig. 7c and d. The enzyme exhibited optimal activity at pH 7. The pH survival profile of the enzyme was quite compatible with the pH activity profile. At acidic and basic pH, the enzyme showed a considerable loss of activity.
Thermodynamic analysis of the CelC307. The Arrhenius equation was used to calculate the activation energy of CelC307 (Fig. 7e). The values of the thermodynamic parameters are shown in Table 5. CelC307 needs 25.36 kJ/mol of energy to activate the reaction (Ea‡), which indicates that the reaction is rapid. The low ΔG‡ of CelC307 likewise indicates a faster reaction. On the other hand, the higher values of ΔH‡ and ΔS‡ at the optimum temperature emphasize the efficient transition state of CelC307. Also, compared to ΔG‡E−S (2.01), the lower value of ΔG‡E−T (−14.11) indicated that a small amount of energy is enough to form the CelC307 transition state (TS) at 40 °C, which suggests that the reaction may be spontaneous.
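The Arrhenius analysis can be sketched as a straight-line fit of ln(k) against 1/T, whose slope equals −Ea/R. The example below is a minimal illustration on synthetic rate data generated from the reported Ea of 25.36 kJ/mol; the temperatures, the pre-exponential factor of 1, and the function name are assumptions and do not reproduce the measurements behind Fig. 7e.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def arrhenius_activation_energy(temps_celsius, rates):
    """Return Ea (kJ/mol) from the slope of ln(rate) versus 1/T (K^-1)."""
    xs = [1.0 / (t + 273.15) for t in temps_celsius]
    ys = [math.log(k) for k in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return -slope * R / 1000.0


# Synthetic rates built from Ea = 25.36 kJ/mol (pre-exponential factor = 1).
temps = [20, 25, 30, 35, 40]
rates = [math.exp(-25360.0 / (R * (t + 273.15))) for t in temps]

ea = arrhenius_activation_energy(temps, rates)
print(f"Ea = {ea:.2f} kJ/mol")
```

With experimental data, only the temperatures below the optimum, where activity rises with temperature, would normally be included in this fit.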
CelC307 had a very low k_in and, correspondingly, a high t1/2 and D value at the optimum temperature (Fig. 7f,g). The highest t1/2 was obtained at 40 °C, whereas a considerable decrease in the t1/2 and D values and an increase in k_in were observed at higher temperatures.
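For first-order thermal inactivation, the half-life and the decimal reduction time (D value) follow directly from the inactivation rate constant: t1/2 = ln 2 / k and D = ln 10 / k. The sketch below illustrates these standard relations, assuming first-order kinetics; the rate constants are hypothetical values, not the measured ones for CelC307.

```python
import math


def half_life(k_inactivation):
    """Time for activity to fall to 50% under first-order inactivation."""
    return math.log(2) / k_inactivation


def d_value(k_inactivation):
    """Time for activity to fall to 10% (one log10 reduction)."""
    return math.log(10) / k_inactivation


# Hypothetical inactivation rate constants (min^-1) at rising temperatures.
for k in (0.001, 0.01, 0.1):
    print(k, round(half_life(k), 1), round(d_value(k), 1))
```

Because both quantities are inversely proportional to the rate constant, they decrease together as the rate constant rises with temperature.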
The high Ea# value (59.29 kJ/mol) calculated for CelC307 confirmed that a large amount of energy is required for its irreversible thermal inactivation, which reflects the inherent thermal stability of CelC307.
The ΔG# of CelC307 increased with temperature, indicating the enzyme's resistance to irreversible thermal inactivation and denaturation and suggesting that the transition state is reached later.
The enthalpy (ΔH#) and entropy (ΔS#) changes for the deactivation of CelC307 showed decreasing trends as the temperature increased from 40 °C to 90 °C. The ΔH# and ΔS# values at the optimum temperature revealed a higher thermal stability of CelC307 at this temperature than at the higher ones. The ΔS#, ΔH#, and ΔG# values revealed the thermophilic character of CelC307 and indicated that the enzyme has a stable conformation at the optimum temperature. The thermodynamic characteristics of CelC307 are listed in Table 6.
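The inactivation parameters discussed here are commonly derived from transition-state (Eyring) theory: ΔH# = Ea# − RT, ΔG# = −RT ln(k·h / (kB·T)), and ΔS# = (ΔH# − ΔG#) / T. The sketch below assumes these standard relations apply and that the rate constant is expressed in s−1. It uses the reported Ea# of 59.29 kJ/mol, while the rate constant is a hypothetical value chosen only for illustration.

```python
import math

R = 8.314                    # gas constant, J mol^-1 K^-1
PLANCK = 6.62607015e-34      # Planck constant, J s
BOLTZMANN = 1.380649e-23     # Boltzmann constant, J K^-1
EA_INACTIVATION = 59.29      # kJ/mol, value reported in the text


def inactivation_thermodynamics(temp_celsius, k_per_s, ea_kj=EA_INACTIVATION):
    """Return (dH in kJ/mol, dG in kJ/mol, dS in J/mol/K) for inactivation."""
    t = temp_celsius + 273.15
    delta_h = ea_kj - R * t / 1000.0
    delta_g = -R * t * math.log(k_per_s * PLANCK / (BOLTZMANN * t)) / 1000.0
    delta_s = (delta_h - delta_g) * 1000.0 / t
    return delta_h, delta_g, delta_s


# Hypothetical rate constant of 1e-4 s^-1 at the optimum temperature.
dh, dg, ds = inactivation_thermodynamics(40.0, 1e-4)
print(round(dh, 2), round(dg, 2), round(ds, 2))
```

Under these relations ΔH# falls linearly as the temperature rises at a fixed Ea#, which matches the decreasing trend described above.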
The thermodynamic characteristics of CelC307 are compared with other studied endoglucanases in Table 7.
Effects of metal ions, denaturing agents, inhibitors, detergents, organic solvents, surfactants, and Na+ on the activity of CelC307. We examined a list of potential activators or inhibitors of cellulases for their effects on CelC307 (Fig. 8a-f). We found a significant inhibitory effect of most tested metal ions, especially ZnSO4, Fe3+, Fe2+, and MnSO4. However, Na+ and Ca2+ at a concentration of 5 mM, and Li+ at both tested concentrations (5 and 10 mM), slightly increased the activity of CelC307. Among the tested general enzyme inhibitors, the denaturing agent GuHCl displayed the highest inhibitory effect on the enzyme, while 2-ME showed a stimulatory effect. SDS was the only tested surfactant that completely inhibited the enzyme activity. Among the organic solvents, glycerol increased the CelC307 activity at concentrations of 5% and 20%. Increasing the concentrations of the specific cellulase inhibitors up to 10 mM completely stopped the enzyme activity. CelC307 was found to tolerate low concentrations of organic solvents (< 5%) well, and its activity was enhanced by up to 20% glycerol.
Table 1. Binding free energies and their corresponding components, obtained from MM/PBSA analysis of the CelC307 complexes with maltose, lactose, and fructose. All values are in kJ/mol. *ΔEvdW: van der Waals interaction energy; **ΔEelec: electrostatic interaction energy; ***ΔGpolar: polar solvation energy.

Kinetic analysis of the specific inhibitors' mode of inhibition and inhibition model. The inhibition mode of the CelC307 enzyme was measured with two concentrations (0.5 and 1 mM) of the three sugars with the highest inhibitory effects: fructose, lactose, and maltose (Fig. 8g,h,i). Michaelis-Menten and Lineweaver-Burk plots were applied to determine the mode of CelC307 inhibition.
According to the obtained data, the inhibition mode of maltose was non-competitive: Vmax was reduced upon addition of the inhibitor, while the lines of the Lineweaver-Burk plot intersected at the same point on the x-axis, so that Km remained unchanged. This result indicated that maltose could bind both to the free CelC307 enzyme and to the enzyme-substrate complex. In contrast, fructose and lactose exhibited a mixed-type inhibition mode, since the lines in the Lineweaver-Burk plots cross at different points and the Km and Vmax values of the control and the two inhibitor concentrations were different. These results suggested that fructose and lactose could bind the enzyme even when the substrate had already been bound.
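The reasoning used to assign the inhibition mode can be sketched as follows: fit 1/v against 1/[S] to obtain Km and Vmax with and without inhibitor, then compare how each parameter changes. This is a simplified illustration on synthetic, noise-free data; the 5% tolerance, the function names, and the apparent Km and Vmax values under inhibition are assumptions, not the measured values from Fig. 8.

```python
def lineweaver_burk_fit(substrate, rates):
    """Fit 1/v = (Km/Vmax)(1/[S]) + 1/Vmax and return (Km, Vmax)."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    return slope * vmax, vmax


def classify_inhibition(km0, vmax0, km_i, vmax_i, tol=0.05):
    """Classify the mode from apparent Km and Vmax (relative tolerance tol)."""
    def same(a, b):
        return abs(a - b) <= tol * abs(b)

    km_same, vmax_same = same(km_i, km0), same(vmax_i, vmax0)
    if km_same and vmax_same:
        return "no inhibition"
    if vmax_same:
        return "competitive"        # Vmax unchanged, apparent Km altered
    if km_same:
        return "non-competitive"    # Km unchanged, Vmax reduced
    if same(km_i / vmax_i, km0 / vmax0):
        return "uncompetitive"      # parallel Lineweaver-Burk lines
    return "mixed"                  # both Km and Vmax altered


def simulate(substrate, km, vmax):
    """Noise-free Michaelis-Menten rates for synthetic test data."""
    return [vmax * s / (km + s) for s in substrate]


S = [0.1, 0.2, 0.5, 1.0, 2.0]                      # substrate, mM (synthetic)
km0, vmax0 = lineweaver_burk_fit(S, simulate(S, 0.46, 1.0))
km_a, vmax_a = lineweaver_burk_fit(S, simulate(S, 0.46, 0.6))
km_b, vmax_b = lineweaver_burk_fit(S, simulate(S, 0.80, 0.6))

print(classify_inhibition(km0, vmax0, km_a, vmax_a))
print(classify_inhibition(km0, vmax0, km_b, vmax_b))
```

With real measurements, double-reciprocal fits amplify the error at low substrate concentrations, so a direct nonlinear fit of the Michaelis-Menten equation is often used to confirm the assignment.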

Discussion
Enzymes can generate biofuels, the fuels of the future, faster and more cheaply than conventional chemical methods. Alongside other glycolytic enzymes, cellulases act on cellulosic substrates to produce a cocktail of carbohydrates that can be converted to ethanol for biofuel. The increasing industrial demand for cellulases has led to a growing interest in engineering existing cellulases and screening new sources to produce enzymes with improved performance. A broad range of microorganisms, including fungi, bacteria, and actinomycetes, have been reported to be efficient cellulase producers. Nature thus offers a large number of cellulase sources with different characteristics and for various applications 36-38.
In the present study, we identified the gene of a new cellulase from the thermophilic indigenous Cohnella sp. A01, named CelC307, and characterized the enzyme using both experimental and computational approaches. In silico analysis of the translated sequence characterized CelC307 as an intracellular protein with a thermostable, hydrophilic, and acidic nature. Multiple sequence alignment and phylogenetic analysis showed high similarity between the CelC307 amino acid sequence and a glycoside hydrolase from a Paenibacillaceae bacterium, classified in the GH5 superfamily. GH5 is the first described and one of the largest GH families, classified by Aspeborg et al. into 51 distinct subfamilies 39. GH5 members are widely distributed in archaea, bacteria, and eukaryotes, and many enzyme activities relevant to biomass conversion have been found in this superfamily. GH5 enzymes use the classical Koshland two-step double-displacement mechanism for catalysis, with the two catalytic residues (glutamates and histidine) at the C-terminal ends of β-strands 4 and 7 40.
Commonly, cellulases have a modular structure, containing a cellulose-binding module (CBM) and a core catalytic domain (CD) linked via a flexible, frequently glycosylated linker 41. According to the Conserved Domains server, the CelC307 CD is predicted to span amino acids 77-362, and the CBM is likely to be found in the enzyme's C-terminal region. The hydrolyzing activity of CelC307 is attributed to several amino acids in the active site, including tyrosine, which plays a role in binding to the sugar chain, and aspartic acid and glutamic acid, which are the catalytic amino acids responsible for glycosidic bond cleavage (nucleophilic attack). Furthermore, it has been reported that arginine, asparagine, and histidine are among the highly conserved amino acids at the active site of GH5 cellulases 42,43. These motifs probably give the enzyme easy access to the substrate 44,45.
It has been shown that high concentrations of individual monosaccharides (mannose, glucose, galactose, xylose, and fructose) have a non-competitive inhibitory effect on hydrolysis by commercial cellulase cocktails 46. The docking results showed that the three ligand molecules bind at sites other than the active site, which suggests a non-competitive inhibition mechanism. Furthermore, in each of the three complexes the ligand contacts at least one Asp residue of CelC307 (Asp463, Asp3, or Asp412) through a hydrogen bond, which may indicate that Asp is a key amino acid in CelC307-sugar interactions.
The overall results of the MD simulation showed instability of the CelC307 protein in the presence of all three tested sugars after 20 ns; however, the lower RMSD, RMSF, and Rg values calculated for CelC307-fructose implied lower instability for this complex. The MM/PBSA calculations were carried out to determine the binding free energy between CelC307 and the three selected sugars. Electrostatic and van der Waals interactions were found to contribute principally to the formation of interactions in all three complexes. Hydrogen bonding is the main stabilizing force in protein stability and plays a key role in maintaining structural integrity. The amino acids involved in hydrogen bonding showed the largest negative free energy, while those predicted to be in the active site showed the lowest contribution to binding. These results support the idea of non-competitive inhibition by the selected sugars.
The CelC307 enzyme was purified using chromatographic and non-chromatographic methods: His-tag affinity to Ni-NTA and single-step thermal shock. The single-step thermal shock is fast, easy, and cost-effective, and, interestingly, it led to a remarkably higher yield of the purified protein. In a previous study, we purified protease 1147 from the same microorganism using single-step thermal shock with ~73% efficiency 47. This simple method could therefore be considered a practical technique for isolating thermostable proteins, especially industrial enzymes.
We examined the effects of two proteases (trypsin and proteinase K) on CelC307. CelC307 was found to be well resistant to trypsin, while proteinase K digested CelC307 completely. Therefore, if this enzyme is used in animal feed, trypsin is suggested as the supplemented protease. Studies have shown that the use of exogenous enzymes in animal diets may be influenced by ruminal proteolytic enzymes. Since the use of cellulases in the diets of ruminants such as cattle is very common, stability against proteolytic digestion can be a significant benefit for exogenous cellulase enzymes 48,49. An important application of cellulases is in the animal feed industry, where, alongside other enzymes including proteases, they improve feed utilization and animal performance 50,51.
CelC307 showed the highest cellulolytic activity against the soluble CMC substrate (β-1 → 4 linkage), which was used to calculate the kinetic parameters. Compared to some other studied cellulases 19,52-55, CelC307 showed a low Km and a high kcat/Km, indicating a high preference of the enzyme for CMC that allows it to hydrolyze β-1 → 4 linkages more rapidly than the other evaluated cellulases.
Economic aspects are the main barrier to the large-scale application of enzymes. To optimize the production process, another aim of this study was to evaluate the parameters involved in the production of recombinant CelC307 in E. coli using statistical designs. We used the Plackett-Burman and Box-Behnken designs to obtain accurate results for increasing the production of recombinant CelC307. Compared to other RSM designs, Box-Behnken has three levels and requires fewer experimental runs, making it easier to arrange and interpret 56. The successful application of RSM methodology for optimizing cellulase enzymes has been reported in different studies 57,58. Jeya et al. used a statistical experimental design to optimize the hydrolysis parameters (such as temperature and pH) of a cellulase from the fungus Trametes hirsuta and achieved an ~18% increase in the saccharification of rice straw 59.
Overall, the results suggest that CelC307 can tolerate a broad range of temperature and pH. Fermentation of lignocellulosic materials in the bioethanol industry and alcoholic fermentation usually proceed at low temperature and neutral pH (37 °C and pH 7) 60, which correspond to the features of CelC307.
Furthermore, neutral and alkaline cellulases find application in many other industries such as food, brewery, and wine. Most isolated cellulases with fungal sources have acidic pH optima, so screening for neutral and alkaline cellulases, usually from bacterial sources, is of great interest for biotechnological research 61 .
Thermodynamic relationships were used to demonstrate the enzyme's ability to maintain activity at high temperatures and the inherent structural stability of CelC307. The Ea of the enzyme was obtained by calculating the thermodynamic parameters of the conversion of [S] and [E] to [ES]. The amount of energy needed to start a reaction is called Ea. The lower this energy barrier, the faster the reaction occurs. The spontaneity of the catalytic reaction is indicated by ΔG and the effectiveness of the transition state (TS) by ΔH and ΔS 62. In general, the thermodynamic parameters of ES complex formation showed that CelC307 had lower values of Ea‡, ΔG‡, and ΔG‡E−T compared to ΔH‡, ΔS‡, and ΔG‡E−S. These results suggest that the CelC307 reaction is likely to be faster and to have an efficient transition state. Similar results have been reported in two studies on the endoglucanases cmc-1 and CBS31 obtained from Aspergillus oryzae and Bacillus subtilis 10,31.
The thermodynamic parameters of irreversible thermal inactivation are calculated from k_in, t1/2, and the D value. A decrease in t1/2 and D value and an increase in k_in as the temperature rises is consistent with what is expected for thermostable enzymes. This behavior has also been demonstrated in several heat-resistant endoglucanases 33,35.
The thermodynamic parameters of irreversible inactivation can be determined from transition state (TS) calculations for the protein. Irreversible denaturation of proteins is assumed to be a two-step reaction, N ↔ U → I, in which the transition state refers to a protein species formed between N (native state) and U (reversibly or partially unfolded state) or I (irreversibly inactivated state) 63. A key factor in understanding an enzyme's thermal capacity is the activation energy needed for its irreversible thermal inactivation.
CelC307 showed high Ea# and ΔG# values in its irreversible thermal inactivation process. In thermodynamic studies of three thermophilic endoglucanases from Aspergillus oryzae, Macrotermes subhyalinus, and Aspergillus fumigatus, and of a β-glucosidase from Aspergillus niger, a similar increase in ΔG# was observed in the transition state with increasing temperature 31-33,64. On the other hand, an enzyme's stability and resistance against denaturation are associated with maximum ΔS# and ΔH# at the optimum operating temperature 65. A decreasing trend of ΔH# and ΔS# at higher temperatures indicates a change in the CelC307 structure toward the transition state 66. However, the high values of ΔG# and Ea# indicate that CelC307 requires a large amount of thermal inactivation energy for denaturation, so CelC307 resists the occurrence of the transition state. Comparable transition state values (Ea# and ΔG#) have been observed for another cellulase enzyme 34. In general, these findings indicate the thermal resistance of CelC307, in the TS phase, at high temperatures. These results suggest that the stability of CelC307 in the TS phase may derive from its catalytic efficiency 66.
Enzyme activity may be affected by different reagents that are employed in industrial processes or that form through equipment erosion or corrosion 67. Identifying the activators and inhibitors of CelC307 is therefore quite relevant, especially for further evaluation of its industrial application. Metal ions exert their effect by associating with enzymes or by forming a complex with other molecules linked to enzymes, serving as electron donors or acceptors, Lewis acids, or structural regulators 68. Ions appear to affect cellulases from various sources differently. However, similar results were reported for the endoglucanase isolated from Aspergillus terreus 69. The decreased enzyme activity observed in the presence of most of the examined metal ions suggested that CelC307 is not a metalloenzyme, although its activity is affected by metals. The results showed that Triton X-100, Tween 20, and Tween 80 increased CelC307 activity. Surfactants have a great affinity for interphases due to their dual nature and amphiphilic structure, which is the basis of the ways in which surfactants affect the physicochemical properties of proteins 70,71. Tween 20 is thought to act as a chemical chaperone, assisting in protein folding through direct hydrophobic interactions with proteins and by altering protein interactions with surfaces 72.
Glycerol is known as an enzyme-stabilizing agent that can shift the native protein ensemble to a more compact state. It can also inhibit protein aggregation by preventing protein unfolding 73. This result is quite notable, considering the potential application of CelC307 in processes performed in the presence of organic solvents 74. For instance, the pretreatment of biomass with crude glycerol was reported to improve the enzymatic hydrolysis of cellulose 75. The addition of 2 and 5 mM EDTA had a slight inhibitory effect on CelC307 activity. Similar observations were also made for CelRH5 and C67-1, derived from rhizosphere and buffalo rumen (both from the GH5 family) 76,77.
Overall, CelC307 seems to be a stable enzyme with no substantial reduction in its activity upon addition of some well-known enzyme inhibitors, including PMSF, urea, and 2-ME, or of non-ionic surfactants. A significant reduction in CelC307 activity was observed only upon addition of GuHCl. Also, the anionic surfactant SDS completely inactivated CelC307 even at a low concentration (2 mM), as reported previously for other cellulases of the GH5 family 76.
Supplementation of the media with different Na+ concentrations indicated that CelC307 retained 70% of its activity up to a Na+ concentration of 50 mM, and that a low concentration of Na+ (5 mM) has an enhancing effect on its activity. Na+ resistance has also been reported for an acidic cellulase from a buffalo rumen metagenome (20 mM) and for an endoglucanase of the extremely halophilic Haloarcula sp. CKT3 (3 M) 78,79. Salt tolerance is a special feature of this novel endoglucanase that could indicate the potential of using CelC307 in the food industry.
Overall, the thermostable CelC307 can well tolerate the presence of various additives used in different industries, making this novel isolated cellulase a promising candidate for further research and industrial applications.

Conclusion
This study reports a novel cellulase (CelC307) from the thermophilic indigenous Cohnella sp. A01 that efficiently catalyzes the hydrolysis of CMC substrate (β-1 → 4 linkage). The production of recombinant CelC307 was optimized using response surface methodology (RSM) and reached 62.58 U/ml. Calculation of the thermodynamic parameters verified the inherent thermostability of CelC307. The molecular docking and MD analysis suggested that the inhibitors bind at sites other than the active site, consistent with a non-competitive inhibition mechanism. According to the characterization results, CelC307 could be considered a promising candidate for further research and industrial applications.

Methods
Strains, vector, and chemical reagents. High pure DNA and vector purification kits were supplied by Peqlab and Roche (Germany). The high pure PCR product purification kit was purchased from Bioneer (Korea). Vector pET26b(+) and the enzymes XhoI and NdeI were obtained from Fermentas (Germany). The E. coli DH5α and BL21 (DE3) strains were obtained from Invitrogen (USA). Carboxymethyl cellulose (CMC) and other chemicals were supplied by Merck (Germany).

In silico structural, functional, phylogenetic analysis and homology modeling. The genome of Cohnella sp. strain A01 was sequenced previously 80 , and the predicted sequence for the endoglucanase cellulase C307 (CelC307) encoding gene was used in this study (GenBank accession No. MN105992.1). The Expasy service (http://web.expasy.org/translate/) was used to translate the nucleotide sequence, and ProtParam (https://web.expasy.org/protparam/) to determine the physicochemical features of the predicted protein.
The family classification of CelC307 was performed according to the CAZy database 81 . Amino acid sequences highly conserved between CelC307 and other homologous enzymes were identified using DNAman software (https://www.lynnon.com/). The neighbor-joining phylogenetic tree was drawn with MEGA 8.0 (https://www.megasoftware.net/), and human endoglucanase (UniProt accession No. Q8N909) was used as the outgroup. The I-TASSER server was used to build the CelC307 3D structure from its amino acid sequence 82 . The endoglucanase of Candida albicans (PDB accession No. 1EQC) was selected as the best crystallographic template for constructing the CelC307 3D model. The generated model was refined with the ModRefiner service 83 . Information about the active site of the CelC307 enzyme was obtained from COFACTOR 84 . The alignment of PDB structures was made with TM-align 85 . The functional domain of CelC307 was identified via the Conserved Domains server (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi).
Docking of CelC307-inhibitors. The modeled CelC307 was docked against three cellulase-specific inhibitors (maltose, fructose, and lactose), which showed the highest inhibitory effect on CelC307 enzymatic activity. All surrounding water molecules were removed, and an energy minimization step was performed to obtain a stable CelC307 structure. The PubChem server (https://pubchem.ncbi.nlm.nih.gov/) was used to retrieve the chemical structures of fructose, lactose, and maltose, and AutoDock Vina was used for docking. The inhibitors were saved in Mol2 format and were docked within a box with an adjusted grid size of 80 × 60 × 50 Å. Complexes were run and saved in PDBQT format, and each complex was analyzed with the ViewDock command in Chimera V1.13.
Molecular dynamics simulation and MM/PBSA calculation. The prepared CelC307-ligand complexes were simulated using the open-source molecular dynamics package GROMACS V4.6.5 with the CHARMM all-atom force field. The production run was 50 ns. The trajectories of each complex were used for analyzing the MD results with Grace software on the Bio-Linux 8 operating system. The binding free energy of the simulated CelC307-ligand complexes was computed using the molecular mechanics/Poisson-Boltzmann surface area (MM/PBSA) method. One hundred snapshot frames were obtained from the 50 ns of simulation, and the binding free energy (ΔG binding ) was calculated according to Eq. (1), where ΔG solvation is the sum of the electrostatic (G polar ) and non-electrostatic (G apolar ) contributions of the complexes.
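For orientation, the MM/PBSA decomposition is commonly written in the general form below. This is the textbook formulation and is shown here as an assumption; it is not necessarily identical, term by term, to the Eq. (1) used by the authors. The subscript x stands for the complex, the free protein, or the free ligand, and the angle brackets denote averages over the extracted snapshots.

```latex
\begin{align}
\Delta G_{\text{binding}} &= G_{\text{complex}} - \left( G_{\text{protein}} + G_{\text{ligand}} \right) \\
G_{x} &= \langle E_{\text{MM}} \rangle + \langle G_{\text{solvation}} \rangle - TS \\
G_{\text{solvation}} &= G_{\text{polar}} + G_{\text{apolar}}
\end{align}
```

In many applications the entropic term TS is omitted when only relative binding energies of similar ligands are compared; whether it was included here is not stated in the text.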
Cloning, expression, and purification of Cohnella sp. A01 CelC307. A colony of Cohnella sp. A01 was added to nutrient broth medium, cultured at 55 °C for 72 h, and centrifuged (8000×g, 20 min at 4 °C); the harvested cells were then used for DNA extraction. The predicted cellulase C307 gene was amplified by PCR, using forward and reverse primers that incorporated NdeI and XhoI restriction sites, respectively (For: 5′-GGA ATT CCAT ATG GGA GAT TTG GAA CGG CTT C-3′, and Rev: 5′-CCG CTC GAG CCT TGC GCC CGC ATG-3′). The reverse primer excluded the stop codon, so that the polyhistidine tag was incorporated at the C-terminus of CelC307. The amplified fragment was cloned into the expression vector pET26b(+). The recombinant plasmid was transformed into the E. coli BL21 (DE3) strain using the heat shock transformation method. The positive plasmids were identified by restriction enzyme digestion, PCR, and sequencing.
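As a quick sanity check on the primer design described above, the following minimal sketch searches each primer for the recognition sequences of the two enzymes (NdeI, CATATG; XhoI, CTCGAG). The primer strings are the ones given in the text with the spacing removed; the function name `find_sites` is introduced here for illustration only.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG"}

# Primer sequences from the text, written 5'->3' with the spacing removed.
FORWARD = "GGAATTCCATATGGGAGATTTGGAACGGCTTC"
REVERSE = "CCGCTCGAGCCTTGCGCCCGCATG"


def find_sites(primer):
    """Return {enzyme: 0-based start position} for every site present in the primer."""
    seq = primer.upper().replace(" ", "")
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    print("forward primer:", find_sites(FORWARD))
    print("reverse primer:", find_sites(REVERSE))
```

The forward primer carries only the NdeI site and the reverse primer only the XhoI site, which matches the directional cloning into pET26b(+) described in the text.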
Kanamycin solution at a final concentration of 30 mg/ml was added to the LB medium. A single recombinant colony was inoculated into LB medium and grown at 37 °C for 16 h, then transferred into fresh LB medium containing 1 mM IPTG and incubated at 37 °C to reach OD600 = 0.6. The cells were pelleted by centrifugation, resuspended in 4 ml of lysis buffer (50 mM), and subjected to sonication (4 cycles of 45 s pulse with a one-min interval on ice). The supernatant obtained from centrifugation of the crude cell extracts (9000×g, 45 min at 4 °C) was used for protein purification. The purification step was performed by two different methods: (I) purification by Ni-NTA sepharose column chromatography, carried out according to the procedures previously described 47 ; (II) purification by a heat shock method, in which the supernatants were incubated at 70 °C for 10 min, followed by 10 min centrifugation at 10,000×g. The purified proteins were assayed for cellulase activity and subjected to SDS-PAGE (12%) for visualization. Protein concentration was estimated using the Bradford method 86 .
CelC307 activity and substrate specificity. According to the method described by Miller, cellulase activity is determined by estimating the reducing sugars released from the carboxymethylcellulose (CMC) substrate using dinitrosalicylic acid (DNS) reagent 87 . Different concentrations of commercial cellulase enzyme and 1% (w/v) CMC substrate were dissolved in potassium phosphate buffer (pH 7) and were used to plot the standard curve. The enzyme-free reaction was used as a negative control. The enzymatic reactions were incubated at 40 °C for 5 min, then terminated by adding an equal quantity of DNS. A 10 min incubation in boiling water allowed the reducing sugars to react with DNS in the alkaline mixture, and a red color developed. The color intensity has a linear relationship with the concentration of reducing sugars.
One unit of the cellulase enzyme is defined as the quantity of enzyme that liberates one μmol of reducing sugar per min.
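The unit definition above can be turned into a short calculation. The sketch below fits a linear standard curve (absorbance versus μmol of reducing sugar) and converts a blank-corrected absorbance reading into U/ml. The standard-curve points, the enzyme volume, and the function names are hypothetical values chosen for illustration; only the 5 min reaction time and the unit definition come from the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def umol_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve: absorbance -> umol of reducing sugar."""
    return (absorbance - intercept) / slope


def activity_units_per_ml(a_sample, a_blank, slope, intercept, minutes, enzyme_ml):
    """Activity in U/ml, with 1 U = 1 umol reducing sugar released per minute."""
    released = (umol_from_absorbance(a_sample, slope, intercept)
                - umol_from_absorbance(a_blank, slope, intercept))
    return released / (minutes * enzyme_ml)


if __name__ == "__main__":
    # Hypothetical standards: umol of reducing sugar and the measured absorbance.
    standards_umol = [0.0, 0.5, 1.0, 1.5, 2.0]
    standards_abs = [0.05, 0.30, 0.55, 0.80, 1.05]
    slope, intercept = fit_line(standards_umol, standards_abs)
    # Hypothetical reading: 5 min reaction (as in the text), 0.1 ml of enzyme.
    print(activity_units_per_ml(0.55, 0.05, slope, intercept, minutes=5, enzyme_ml=0.1))
```

Subtracting the enzyme-free blank removes the background contributed by the substrate and the reagent, so only the sugar released by the enzyme is counted.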
The specificity of CelC307 was determined by testing 1% (w/v) laminarin, pustulan, chitin, and pectic acid as substrates. All reactions were analyzed in triplicate.
Zymography and protease digestion resistance. Resistance to proteolytic digestion was tested with two proteases, trypsin and proteinase K. One mg/ml of protease and 0.6 mg/ml of CelC307 were mixed in a ratio of 20:1 (v/v) and incubated at 37 °C for 30 min. Digestion of bovine serum albumin (BSA) at the same ratio was used as the positive control. The results of the digestion reactions were analyzed by SDS-PAGE.
Michaelis-Menten kinetics. The affinity of CelC307 for the CMC substrate was determined by measuring the enzyme activity in the presence of increasing substrate concentrations (0.625 to 20 mg/ml), and the Michaelis-Menten equation was fitted to the data. The CelC307 kinetic parameters (Km, Vmax, kcat, and kcat/Km) were calculated in GraphPad Prism V.8 by nonlinear regression using the Michaelis-Menten model.
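The nonlinear fit described above was done in GraphPad Prism; the following is only a minimal pure-Python sketch of the same idea, not the authors' procedure. It exploits the fact that the Michaelis-Menten model v = Vmax·S/(Km + S) is linear in Vmax, so for any trial Km the best Vmax has a closed form, and Km is then found by a one-dimensional search. The two-fold dilution series and the synthetic rates in the usage part are assumptions for illustration.

```python
import math


def mm_rate(s, vmax, km):
    """Michaelis-Menten rate for substrate concentration s."""
    return vmax * s / (km + s)


def best_vmax(km, substrate, rates):
    """Least-squares Vmax for a fixed Km (the model is linear in Vmax)."""
    f = [s / (km + s) for s in substrate]
    return sum(v * fi for v, fi in zip(rates, f)) / sum(fi * fi for fi in f)


def sse(km, substrate, rates):
    """Sum of squared residuals at a given Km, using the optimal Vmax."""
    vmax = best_vmax(km, substrate, rates)
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(substrate, rates))


def fit_michaelis_menten(substrate, rates, km_lo=1e-3, km_hi=1e3, grid=200, iters=60):
    """Return (Vmax, Km): coarse log-spaced scan of Km, then golden-section refinement."""
    step = (math.log(km_hi) - math.log(km_lo)) / (grid - 1)
    logs = [math.log(km_lo) + i * step for i in range(grid)]
    best = min(range(grid), key=lambda i: sse(math.exp(logs[i]), substrate, rates))
    a = logs[max(best - 1, 0)]
    b = logs[min(best + 1, grid - 1)]
    inv_phi = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(math.exp(c), substrate, rates) < sse(math.exp(d), substrate, rates):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    km = math.exp((a + b) / 2)
    return best_vmax(km, substrate, rates), km


if __name__ == "__main__":
    # Assumed two-fold series spanning the 0.625 to 20 mg/ml range from the text.
    substrate = [0.625, 1.25, 2.5, 5.0, 10.0, 20.0]
    # Synthetic, noise-free rates generated with Vmax = 2.0 and Km = 3.0.
    rates = [mm_rate(s, 2.0, 3.0) for s in substrate]
    print(fit_michaelis_menten(substrate, rates))
```

Fitting the untransformed data in this way avoids the error distortion introduced by double-reciprocal (Lineweaver-Burk) linearization, which is the usual reason for preferring nonlinear regression.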

Optimization of expression.
Statistical multivariate optimization (SMO) was used to obtain the optimal response (here, CelC307 enzyme activity) by statistical analysis. The SMO was performed in two steps. First, 12 sets of trials were conducted based on the Plackett-Burman design in the screening experiment to determine the most critical factors affecting CelC307 enzyme activity 89 . All factors were examined at two levels and consisted of pH (3 and 8), temperature (20 and 40 °C), rotation (70 and 200 rpm), optical density (0.3 and 1), induction time (4 and 18 h), yeast extract (2.5 and 7.5 g/l), and tryptone (7.5 and 12.5 g/l). In the next step, the key factors obtained (here pH, temperature, and inoculation concentration) were further optimized using the Box-Behnken method to reach the best possible levels of the factors for the highest CelC307 enzyme activity 90 . Twenty experimental trials with six replicates at the center point were designed for the three selected factors. Each factor was investigated at five levels (between − 1 and + 1), and the response value was the CelC307 activity (unit/ml). Minitab 16 (Minitab Inc., USA, https://www.minitab.com/en-us/) and Design-Expert 12 (Design-Expert Inc., USA, https://www.statease.com/software/design-expert/) software were used for experimental data analysis.
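To make the screening step concrete, the sketch below builds the standard 12-run Plackett-Burman matrix by cyclically shifting the usual generator row and appending a row of low levels, then maps the first seven columns to the seven factors and levels listed above. The assignment of factors to columns is an assumption made here for illustration (the remaining four columns act as unassigned dummy columns); the run order and column assignment actually used by the authors are not given in the text.

```python
# Standard generator row of the 12-run Plackett-Burman design (+1 high, -1 low).
GENERATOR = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]

# (low, high) levels of the seven factors as listed in the text.
# The mapping of factors to design columns is an illustrative assumption.
FACTORS = {
    "pH": (3, 8),
    "temperature_C": (20, 40),
    "rotation_rpm": (70, 200),
    "OD600": (0.3, 1),
    "induction_time_h": (4, 18),
    "yeast_extract_g_per_l": (2.5, 7.5),
    "tryptone_g_per_l": (7.5, 12.5),
}


def plackett_burman_12():
    """Return the 12 x 11 coded design: 11 cyclic shifts plus one all-low run."""
    rows = [GENERATOR[-i:] + GENERATOR[:-i] for i in range(11)]
    rows.append([-1] * 11)
    return rows


def design_table(factors=FACTORS):
    """Translate coded levels into real factor settings, one dict per run."""
    names = list(factors)
    table = []
    for row in plackett_burman_12():
        run = {}
        for j, name in enumerate(names):
            low, high = factors[name]
            run[name] = high if row[j] == +1 else low
        table.append(run)
    return table


if __name__ == "__main__":
    for number, run in enumerate(design_table(), start=1):
        print(number, run)
```

Every column of the coded matrix contains six high and six low settings and is orthogonal to every other column, which is what allows the main effect of each factor to be estimated independently from only 12 runs.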
Effect of temperature and pH on the CelC307 activity and stability. A temperature range of 10-100 °C, with 10 °C intervals, was applied to assess the effect of temperature on CelC307 activity and stability. The thermal stability of CelC307 was examined in two ways: (I) the enzyme was pre-incubated at temperatures from 10 to 100 °C (with 10 °C intervals) for 90 min and then assayed at the optimum temperature; (II) the activity of CelC307 was assayed at 60 to 90 °C for 6 h at one-hour intervals.
The CMC 1% (w/w) substrate was dissolved in 50 mM acetate, phosphate, and glycine buffers, which provided acidic, neutral, and basic pHs in the range of 2 to 12. The enzymatic assay was performed at the optimum temperature (40 °C). Subsequently, the pH stability of the enzyme was assayed in two ways: (I) the enzyme was pre-incubated at pH 2 to 12 for 90 min, then assayed under optimum conditions; (II) the residual CelC307 activity was assayed at pH 4, 7, and 11 for 6 h at one-hour intervals. All assays were performed in triplicate.
Thermodynamic study. The Arrhenius equation was used to calculate the activation energy (Ea‡) of CelC307 66 , where α is the slope of the plot of Ln [ka] versus 1/T, and R is the gas constant (8.314 J/K mol). kB and ℏ are the Boltzmann and Planck constants, respectively, and N is Avogadro's number.
T is the temperature in Kelvin.
The ∆G‡ is the change in the Gibbs free energy of activation. The ∆H‡ is the change in enthalpy,